Endoplasmic reticulum aminopeptidase 1 (ERAP1) is an essential component of the immune system,
because it trims peptide precursors and generates the N-termini of the final MHC class I-restricted epitopes.
To examine ERAP1's unique properties of length- and sequence-dependent processing of antigen
precursors, we report a 2.3 Å resolution complex structure of the ERAP1 regulatory domain. Our study
reveals a binding conformation of ERAP1 to the carboxyl terminus of a peptide, and thus provides direct
evidence for the molecular ruler mechanism.
Antigen processing is an integral part of cell-mediated immune surveillance and responses that involve
MHC restriction and T cell recognition 1. Processed antigenic peptides, when presented on the cell surface
by MHC molecules to T cells, serve as identity tags to be monitored and responded to by our immune
system. To be presented by MHC class I molecules, endogenous antigens need to be cut to 8 to 11 residues in
length in order to fit into the MHC binding groove 2,3. The precursors of class I antigenic peptides are generated
mainly by proteasomes in the cytosol to have the correct C-terminus but with an N-terminal extension of several
amino acids 4. These peptide precursors are then transported into the lumen of the endoplasmic reticulum (ER) by
the transporter associated with antigen processing (TAP), an MHC-encoded peptide transporter capable of
importing both mature epitopes and N-extended precursors ranging in length from 7 to more than 20 amino
acids 5-7. TAP transports the longer precursors more efficiently than the mature epitopes 8. Thus, many peptide
precursors need to be further trimmed inside the ER to generate the final N-termini of mature epitopes 9,10.

ERAP1 (ER aminopeptidase 1), also named ERAAP (ER aminopeptidase associated with antigen processing),
is one of two ER luminal aminopeptidases that perform the final N-terminal trimming of class I antigenic peptides 11-13.
The second enzyme is called ERAP2 or L-RAP (leukocyte-derived arginine aminopeptidase) 14. These two
enzymes are highly homologous and both are induced by interferon-γ, a potent stimulator of MHC class I
presentation. ERAP1 is identical to a previously isolated enzyme called adipocyte-derived leucine aminopeptidase 15,
whose immunological relevance was not previously recognized. As expected for a peptidase involved in
generating MHC class I antigens, ERAP1 is expressed in most cells 16, and its expression level is strongly
upregulated by interferon-γ 11-13. ERAP1 strongly prefers peptide substrates that are 9-16 residues in length 12,13,17,18,
which corresponds to the lengths of peptides transported selectively by TAP 5,8. This length preference allows
ERAP1 to rapidly trim N-extended peptides to 8-9 residues, after which further cleavages occur much more slowly or
cease completely 12,13,17,18, producing the optimal length of epitopes that are presented by most MHC class I
molecules. Moreover, ERAP1 catalysis is activated by an unusual allosteric mechanism, termed the "molecular
ruler" mechanism, that monitors the substrate length and the identity of the C-terminal amino acids 9-16 residues
away from the N-terminal cleavage site 18. Thus, the substrate specificity of ERAP1 influences which peptides are
trimmed and available for MHC class I presentation and is critical for immunological function.

To gain insights into the molecular ruler mechanism, our group has been pursuing structure-function
studies of the ERAP1 enzyme. During the preparation of this manuscript, two structures of ERAP1 were reported
by other groups 19,20. However, without a peptide bound in those ERAP1 structures, and with relatively low
resolutions (2.7-3 Å), the ERAP1 mechanism of peptide length- and sequence-dependent activity remained elusive. We
report here a 2.3 Å resolution structure of the ERAP1 regulatory domain in complex with a peptide fragment.
This complex structure allows a direct visualization of peptide binding to the ERAP1 regulatory domain, and thus
provides a structural basis for the molecular ruler mechanism.
Results

ERAP1 C-terminal domain proposed as the putative regulatory
domain. There are indications that ERAP1 has a modular organization
for its molecular ruler mechanism: binding of a peptide's C-terminus
to a regulatory site, distinct from the catalytic site, allosterically
activates its peptidase activity 18. Insights into the structure of the
ERAP1 regulatory domain should elucidate its unusual length
dependence and substrate specificity, as well as the mechanism of
antigen processing. Based on sequence comparisons, we hypothesized
that the unique C-terminal half of ERAP1 contains the C-terminus
binding groove for antigenic peptide precursors and acts
as the regulatory domain for the molecular ruler mechanism. ERAP1
is a zinc-containing metallopeptidase that is a member of the
"M1/gluzincin" family of peptidases 15,21. It is most closely related to two
other aminopeptidases of the M1 family, ERAP2 and P-LAP 22, with
43-49% sequence identity spread along the entire sequences. There
is also evidence to suggest that ERAP2 trims peptides in a way
similar to ERAP1 17. These three enzymes recognize and cleave
peptide precursors, and belong to the oxytocinase subfamily of the
M1 peptidases 21. However, sequence alignment with other members
of the M1 family that have a broader range of substrates shows that
homology is clustered only within the N-terminal half of ERAP1: the
M1 homology region. In contrast, there is no significant sequence
homology between the C-terminal half of ERAP1 and other
broad-substrate-range M1 members. Nonetheless, this C-terminal domain
is highly conserved among the three oxytocinase subfamily members
that all bind and trim peptide substrates 14,21, and thus is highly likely
to harbor the regulatory site that senses and recognizes a peptide's length
and the sequence near its C-terminus.

Structure of ERAP1 C-terminal domain in complex with a peptide
C-terminus. We have determined a 2.3 Å resolution crystal structure
of the ERAP1 regulatory domain (a.a. 529-941) in complex with a
peptide carboxyl-terminus, with Rcryst = 0.19 and Rfree = 0.26. Other
crystallographic data collection and refinement statistics are summarized
in Table 1. This ERAP1 regulatory domain is composed of two
subdomains: a small beta-sandwich with two β-sheets (residues 530-611),
and a larger bowl-shaped alpha-helix domain with 16 α-helices
(residues 614-941) forming a concave surface (Fig. 1). The overall
structure is similar to the corresponding domain of the full-length
ERAP1 structure in the closed form 20, with a root mean square (r.m.s.)
deviation of 1.7 Å for all main-chain atoms; the main differences result
from a slight hinge movement between these two subdomains. In the
crystal, the C-terminal end of a peptide, an engineered His-tag tail
from a neighboring molecule (-RMHHHHHH-CO2), docks into
a groove on the concave surface of the alpha-helix domain.
Interactions between ERAP1 and the peptide mainly involve hydrophobic
and van der Waals contacts, plus a few hydrogen-bond interactions
with the carboxylate end of the peptide (Fig. 1b, and see below). As
oriented in Fig. 1a, this peptide C-terminus binding groove faces
towards the N-terminal catalytic site in an intact ERAP1 protein, with
a large internal cavity between the N-terminal catalytic zinc site
and the C-terminal regulatory domain 19,20. The location of the
C-terminus binding groove, ~29 Å from the N-terminal catalytic zinc
site (Zn in Fig. 1a), is consistent with the expectation for a
C-terminus anchoring site of a molecular ruler that monitors the length
and sequence of an antigen precursor of nine to ten residues in
length. Longer antigenic precursors could be accommodated by
bulging or zig-zagging in the middle of the peptide into the central
and widest portion (~30 Å) of the large substrate cavity.
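
As a rough back-of-the-envelope check on this geometry, the sketch below estimates how many residues a peptide needs in order to span the ~29 Å between the catalytic zinc site and the C-terminus binding groove. The rise of 3.4 Å per residue (a typical textbook value for a nearly fully extended chain) and the function names are assumptions made for illustration; they are not values or methods from this study.

```python
import math

# Distance between the catalytic zinc site and the C-terminus
# binding groove, as reported in the text (angstroms).
ZN_TO_GROOVE_A = 29.0

# ASSUMPTION: approximate rise per residue of a nearly fully extended
# peptide chain (angstroms). Not a value taken from this study.
RISE_PER_RESIDUE_A = 3.4


def min_residues_to_span(distance_a: float,
                         rise_a: float = RISE_PER_RESIDUE_A) -> int:
    """Smallest residue count whose extended length covers distance_a."""
    return math.ceil(distance_a / rise_a)


def slack(n_residues: int,
          distance_a: float = ZN_TO_GROOVE_A,
          rise_a: float = RISE_PER_RESIDUE_A) -> float:
    """Extra contour length (angstroms) left over for bulging or zig-zagging.

    A negative value means the extended peptide is too short to reach.
    """
    return n_residues * rise_a - distance_a


if __name__ == "__main__":
    print("minimum residues:", min_residues_to_span(ZN_TO_GROOVE_A))
    for n in (8, 9, 10, 16):
        print(n, "residues -> slack", round(slack(n), 1), "A")
```

Under these assumptions the minimum comes out at nine residues; counting N-1 peptide steps between the first and last residue instead would give ten. Either way the estimate falls in the nine-to-ten-residue range discussed above, and longer peptides have surplus length that would have to bulge into the cavity.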

Allosteric activation of the ERAP1 aminopeptidase activity by
histidine-containing peptides. ERAP1 is likely to bind and process
His-containing antigen precursors, since histidine is an anchor or
preferred residue at various positions of antigen ligands for many
MHC class I molecules. Histidine has been reported to be an anchor
residue at the P2 position of antigen peptides for HLA-B*3801, at the
P2 position for HLA-B*39011, and at the P7 (PC-2, the 2nd position
N-terminal to the peptide carboxyl-terminal) position for mouse
H Qa-2 23. It is also a preferred residue at various positions of
antigens, including the P9/PC carboxyl-terminal position of antigen
peptides for HLA-B*2705, the P8/PC-1 position for HLA-B8,
HLA-B*3701 and HLA-Cw*0401, and the P7/PC-2 position for
HLA-A*3101, HLA-B8 and HLA-B*3902 23. In addition, HLA-A3 has also
been shown to bind peptides with a histidine at the PC position 24.
Thus binding of the His-tag peptide to the ERAP1 regulatory domain
as observed in the crystal structure is likely to reflect a functional
conformation of ERAP1's substrate recognition.




Figure 1 | Structure of the ERAP1 C-terminal regulatory domain in
complex with a peptide. (a) Overall structure of the ERAP1 regulatory
domain, with respect to the N-terminal catalytic domain. The β-sandwich
and α-helix subdomains of the regulatory domain are shown as a ribbon
diagram and colored in blue and green, respectively. Bound peptide is
shown as a stick model and colored by atom types. Location of the catalytic
zinc site in the N-terminal domain is shown as a grey sphere, with its
distance to the peptide C-terminus indicated in angstroms. (b) ERAP1
interactions with bound His-tag peptide (sequence MHHHHHH-CO2).
Bound peptide is shown as a thick stick model and colored by atom types:
green for main-chain carbons, yellow for side-chain carbons, blue for
nitrogens, and red for oxygens. Side-chains of selected residues of ERAP1
are also shown and labeled as a thin stick model in gray. Dotted lines denote
hydrogen-bond interactions.



Table 1 | Crystallographic data collection and refinement statistics

ERAP1 regulatory domain in complex with a peptide

Data collection
  Space group                        P21
  Cell dimensions
    a, b, c (Å)                      63.8, 67.3, 65.9
    α, β, γ (°)                      90.0, 110.2, 90.0
  Resolution (Å) a                   53.2-2.3 (2.4-2.3)
  I/σ(I)                             8.2 (2.3)
  Completeness (%)                   99.5 (99.7)
  Rsym (%) b                         15 (69)

Structure refinement
  Resolution (Å)                     53.2-2.3
  No. of reflections                 22,138
  Rcryst/Rfree                       0.19/0.26
  R.m.s. deviations
    Bond lengths (Å)                 0.05
    Bond angles (°)                  2.1
  Ramachandran plot
    Most favored regions (%)         97.3
    Additional allowed regions (%)   2.2
  B-factors (Å2)
    Protein                          21.7
    Water                            24.7

a Numbers in parentheses refer to the outermost resolution bin.
b Rsym = $\sum_{hkl}\sum_i |I_{hkl,i} - \langle I_{hkl}\rangle| / \sum_{hkl}\sum_i I_{hkl,i}$.
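
For readers less familiar with the Rsym statistic quoted in Table 1, the following is a minimal sketch of how such a value is computed from repeated intensity measurements. It assumes that the observations have already been reduced to unique Miller indices (symmetry mates mapped onto one index); the function name, the data layout and the toy numbers are hypothetical and are not taken from the processing software used in this work.

```python
from collections import defaultdict
from typing import Iterable, Tuple

# One observation: (unique Miller index hkl, measured intensity).
# ASSUMPTION: symmetry-equivalent reflections already share one index.
Observation = Tuple[Tuple[int, int, int], float]


def r_sym(observations: Iterable[Observation]) -> float:
    """Rsym = sum_hkl sum_i |I_i - <I_hkl>| / sum_hkl sum_i I_i."""
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        mean_i = sum(intensities) / len(intensities)
        # Deviation of each measurement from the mean of its reflection.
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    if denominator == 0:
        raise ValueError("no intensity to normalise by")
    return numerator / denominator


if __name__ == "__main__":
    # Toy data (hypothetical): two reflections, each measured twice.
    toy = [((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
           ((0, 1, 0), 50.0), ((0, 1, 0), 50.0)]
    print(round(r_sym(toy), 4))
```

In this simplified version a reflection measured only once contributes nothing to the numerator but still adds to the denominator; data-processing programs typically handle such cases explicitly.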
Figure 2 | Allosteric activation of ERAP1 by His-tag peptides. Activation
of L-AMC hydrolysis by a series of peptides with a histidine at the PC 
position. As negative controls, hydrolysis activities were measured in the 
absence of a peptide, or with a negatively charged peptide (EEEEGEG) that 
was considered to be a poor substrate for ERAP1 25 . Peptides YPILAGH and 
KEIVKKH are derived from two naturally processed antigenic peptides, a 
cytochrome P450 epitope and a HSP 86 epitope, respectively. The other 
three peptides, MHHHHHH, RMHHHHHH, and MHHHRHH, are 
derived from the bound C-terminal His-tag determined in the crystal 
structure. The basal activity in which L-AMC hydrolysis was measured in 
the absence of added peptide is indicated by the open bar and the 
horizontal line. Standard deviations of three separate experiments are 
indicated by error bars. 

More studies are needed to resolve some conflicting results
reported for the substrate sequence preference of ERAP1. For example,
peptides with Arg, Lys or His at the PC position were reported by Chang
et al. 18 to be suboptimal substrates, yet Evnouchidou et al. concluded
that Arg and Lys are the best anchor residues at the PC position 25. To
verify the functional relevance of the His-tag binding, we performed
enzyme activation assays. It has been shown that peptides shorter
than 8 residues are not long enough to be efficiently trimmed by
ERAP1 19. However, these short peptides can bind to the regulatory
domain of ERAP1 and allosterically activate the hydrolysis of L-AMC
(leucine-7-amido-4-methyl-coumarin) by the ERAP1 catalytic
site. To demonstrate a functional binding of His-tag peptides, we
monitored activation of L-AMC hydrolysis by a series of peptides
based on the bound His-tag sequence RMHHHHHH. Consistent
with previous studies 19,25, ERAP1 hydrolyzes L-AMC at a low basal
level in the absence of peptide (open bar in Fig. 2), or in the presence
of a predominantly negatively charged peptide (the peptide
EEEEGEG); such a negatively charged peptide was considered to
be a poor substrate for ERAP1 25. We then examined activation of
L-AMC hydrolysis by peptides containing C-terminal sequences of
two naturally processed antigenic peptides that have a histidine at the
PC position 26: TRYPILAGH (a cytochrome P450 epitope; the
C-terminal fragment YPILAGH was used in Fig. 2), and RRIKEIVKKH (a HSP 86
epitope; the C-terminal fragment KEIVKKH was used). In the presence of
these two C-terminal fragments of antigenic peptides, a significant
increase of ERAP1 aminopeptidase
activity on L-AMC over the basal level was detected, suggesting an
allosteric binding to the regulatory domain. Similarly, a significant
increase of ERAP1 aminopeptidase activity was also detected in the
presence of the His-tag peptides MHHHHHH and RMHHHHHH.
Activation was further increased by placing an arginine anchor near
the C-terminus, as in the peptide MHHHRHH, which is consistent
with ERAP1 specificity studies at this (PC-2) peptide position 25
(see below). Altogether, these His-containing peptides can thus bind
to the ERAP1 regulatory domain and allosterically activate the
aminopeptidase activity at ERAP1's catalytic zinc site. Since many T
cells reactive to self and prevalent antigens presented by MHC are
eliminated during their development in the thymus to avoid
autoimmunity 27, small but significant effects from minor antigens, similar
to those observed in Figure 2, could play critical roles in protection
against foreign pathogens.

Molecular modeling of an optimal binding peptide into the
ERAP1 regulatory site. To analyze the peptide binding groove
and potential pockets or sub-sites for recognizing substrate
side-chains and the C-terminus, we modeled an Arg-Ala-Phe sequence into
the last three positions (PC-2, PC-1, and PC) of the bound His-tag
peptide (Fig. 3a). This tri-peptide model sequence was based on
the substrate specificity of ERAP1 at the PC-2 and PC positions 25; in the
crystal the PC-1 side chain points away from the binding groove and
thus was reduced to an alanine in the model tri-peptide. To better fit
into the binding groove and sub-pockets, small adjustments were
made (Fig. 3a): a shift of ~0.9 Å towards Glu831, and a side-chain rotation
to allow an aromatic ring interaction between the PC Phe and
Phe803. One pocket contains predominantly non-polar atoms
located at the floor of the peptide binding cleft, surrounded by
Ile681, Leu733, Leu734, Val737, and Leu769, with Phe803 ready to
make an aromatic ring interaction with the PC phenylalanine
side-chain of the model peptide (Figs. 3a and 3b). Another deep pocket is
located in close proximity to the PC-2 arginine side-chain, with
the entry wrapped around by residues Leu769, Phe803, and
Ser799, and tunnels into the negatively charged residues Glu802
and Glu831. This E802/E831 pocket appears to be ideal for
binding an anchoring arginine or lysine side-chain of
antigenic precursors. As shown in Fig. 2, adding an arginine
anchor in the PC-2 position of the peptide substantially increases
the peptide's activation capacity. Meanwhile, the carboxylate end
of the peptide ligand makes direct or water-mediated contacts
with Tyr684, Lys685, Arg807, and Arg841, and is partially exposed
to bulk solvent. Model building suggests that small adjustments
could be made to allow extensions (e.g. with an amine or glycine)
from the current C-terminus location, similar to the C-terminal
extension of an MHC class I binding mode 28. Most of the residues
that make up the two binding pockets for peptide side-chain
anchors are polymorphic among ERAP1, ERAP2, and P-LAP 22,
suggesting different binding specificities for these three enzymes.
On the other hand, several residues in and around the carboxylate
binding site are conserved among these oxytocinase subfamily
members. These conserved residues, which include Tyr684 and
Arg841, provide additional contacts to recognize a constant feature
of antigenic precursors. In the crystal structure, Tyr684 makes a
water-mediated contact with the peptide carboxylate end whereas
Arg841 makes a direct contact with the main-chain carbonyl of the
PC-2 residue (Fig. 1c) and a water-mediated contact with the peptide
carboxylate end.

Figure 3 | Specificity pockets of ERAP1 for peptide anchors. (a) Overlay of a modeled tri-residue peptide (yellow) onto the crystal structure of the last
three positions of the bound His-tag peptide (green). The model tri-residue peptide Arg-Ala-Phe was built based on the last three positions (PC-2, PC-1,
PC carboxyl-terminal) of the bound His-tag peptide, with minor adjustments for better fit (see text and Methods). Surrounding side-chains of ERAP1
residues are shown and labeled. (b) The C-terminus binding groove of ERAP1 is shown as solvent accessible surfaces colored by electrostatic potentials:
from red to blue for negatively charged to positively charged areas. The model tri-peptide is shown as a thick stick model colored by atom types: green for
main-chain carbons, yellow for side-chain carbons, blue for nitrogens, and red for oxygens. Side-chains of selected ERAP1 residues are also shown and
labeled in white. (c) Schematic outlines of specificity pockets, with surrounding side-chains of selected ERAP1 residues shown and labeled.

Discussion 

The high resolution structure of the ERAP1/peptide complex provides
detailed insights into peptide C-terminus recognition and the
molecular ruler mechanism. Structural comparisons between the
open and closed conformations of ERAP1 suggest that a
conformational change with reorientation of a key catalytic residue is
involved in the molecular ruler mechanism 19,20. Adding the new
peptide binding conformation reported here, we propose a
detailed model for ERAP1's molecular ruler mechanism to
sense the substrate length and recognize the sequence near the
peptide C-terminal end (Fig. 4). In the absence of a peptide
anchored at the regulatory pocket, ERAP1 stays in the
lower-activity open conformation, resulting in the basal level of
L-AMC hydrolysis (Fig. 4a). This lower-activity open conformation
could also account for some low level trimming of peptides down
to 4 or 5 residues 4, by cutting the peptide's N-terminus without
having its C-terminus in contact with the regulatory pocket. In the
presence of a peptide ligand longer than 9 or 10 residues, ERAP1
uses specificity pockets at its regulatory domain to anchor the
peptide's side-chains at or near the carboxyl-terminus (Fig. 4b).
This anchoring and recognition triggers conformational changes
that activate the N-terminal catalytic center located ~30 Å away. As
observed in Figure 2, short peptides can also bind to the
regulatory domain of ERAP1 and activate in trans the hydrolysis of
L-AMC by the N-terminal catalytic site. Nonetheless, for a peptide
precursor to be efficiently processed, it needs to be 9 or 10
residues long to be able to simultaneously place its N- and C-terminal
ends into the catalytic zinc site and the C-terminus docking
pockets, respectively. Longer peptides could be accommodated by
bulging or zig-zagging into the widest portion of the large substrate
cavity. Thus, even though the architectures of the binding grooves are
different, ERAP1 and MHC class I molecules utilize the same
strategy to bind a large repertoire of antigenic peptides: binding
at both the N- and C-termini of peptides through common features
and side-chain anchors, while allowing flexibility in sequence and
length through bulging or zig-zagging in the middle 2,3. It is
interesting to note that ERAP1 has dual specificities at the PC position
of a peptide nonamer series: preferring either a positively charged
(R, K) or a hydrophobic side-chain (F, V, M) 25. This is likely to
result from different specificity pockets being involved in anchoring
these two types of peptide anchors. As for the model peptide
(Fig. 3b), the Phe803 pocket could bind a phenylalanine anchor.
However, for a peptide precursor with an arginine at the PC
position, it is plausible that this positively charged side-chain
anchor inserts into the Glu802/Glu831 pocket instead. For the latter
group of peptides with a PC arginine, additional affinity could
come from the conserved Arg841 making direct contacts with
the peptide's carboxylate end.




Figure 4 | Proposed model for the molecular ruler mechanism. (a) A small substrate (L-AMC shown) cannot concurrently reach the catalytic site and
the regulatory pockets. ERAP1 stays in the lower-activity open conformation and inefficiently hydrolyses L-AMC. This could also account for some low
level trimming of peptides down to 4 or 5 residues 4. (b) A peptide longer than 9 or 10 residues can reach the catalytic zinc site from the regulatory domain
binding groove where its carboxyl-terminus is anchored. Triggered by the peptide anchoring into the regulatory pockets (lightning), ERAP1 changes
into the higher-activity closed conformation and efficiently trims one residue from the peptide's N-terminal end. Slightly shorter and longer peptides take
the less (red) and more (green) bulging paths, respectively. Allosteric activation of ERAP1 on L-AMC hydrolysis by a short peptide can also be achieved in
trans (a pair of red slashes) by concurrently occupying the catalytic site and regulatory pockets.
Methods

Protein expression and purification. Baculovirus cDNA encoding human
ERAP1 full-length protein or C-terminal domain was constructed with a C-terminal
hexa-histidine tag according to the protocols of the manufacturer (Invitrogen). The
presence of the ERAP1 gene and the integrity of the purified recombinant bacmid
DNA were verified by PCR. To express the protein, the bacmid DNA was transfected
into Sf9 insect cells according to the manufacturer's protocols. Sf9 insect cells were
then infected with the P4 recombinant viral stock at an MOI of 1 pfu/cell, and the
protein was harvested 48 hours after infection. Protein expression was confirmed
by western blot using a primary antibody against the hexa-histidine tag.

Cell pellets were re-suspended in 50 mM NaH2PO4, pH 8.0, 300 mM NaCl and
10 mM imidazole, and lysed by freeze-thaw cycles and sonication. The supernatant 
was loaded onto a Ni-NTA column and washed several times with 50 mM sodium 
phosphate buffer, pH 8.0, containing 300 mM NaCl, and 10-30 mM imidazole. The 
protein was then eluted with 50 mM sodium phosphate buffer, pH 8.0, containing 
300 mM NaCl, and 400 mM imidazole. Glycerol was added to the eluted solution to a 
final concentration of 16% (v/v), and then the concentrated protein was further 
purified through a Superdex 200 gel filtration column (Amersham Pharmacia) on an
FPLC system with a buffer containing 10 mM Tris, pH 7.5, 10 mM NaCl. A single
peak for ERAP1 enzyme or C-terminal domain was collected and the protein was 
concentrated to 3-7 mg/ml for crystallization. 

Enzyme Activation Assays. Aminopeptidase activity was determined by measuring
the fluorescence of 7-amino-4-methylcoumarin (AMC) released by hydrolysis of
Leucine-AMC (L-AMC). Assays were performed at 25 °C in 200 µl of 50 mM
Tris/HCl, pH 8.0, 0.1 M NaCl, containing 0.75 µg/ml ERAP1 enzyme in the
presence or absence of 100 µM peptides. Hydrolysis of 50 µM L-AMC
was followed for 5 minutes and measured using an excitation wavelength of
380 nm and an emission wavelength of 460 nm. Fluorescence intensities were
calibrated using AMC as standard 12.
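
To make the data reduction behind such an assay concrete, here is a minimal sketch of converting a fluorescence increase into an AMC release rate with a standard curve, and then expressing activation relative to the basal (no peptide) rate. The function names, the calibration line forced through the origin, and all numbers are illustrative assumptions; the source does not describe its analysis at this level of detail.

```python
from statistics import mean, stdev
from typing import Sequence


def fit_slope_through_origin(conc_um: Sequence[float],
                             fluorescence: Sequence[float]) -> float:
    """Least-squares slope k of fluorescence = k * [AMC] (units per uM).

    ASSUMPTION: blank-subtracted readings, so the line passes through 0.
    """
    sxy = sum(c * f for c, f in zip(conc_um, fluorescence))
    sxx = sum(c * c for c in conc_um)
    return sxy / sxx


def hydrolysis_rate(delta_fluorescence: float, minutes: float,
                    fluor_per_um_amc: float) -> float:
    """AMC released per minute (uM/min) from a fluorescence increase."""
    return delta_fluorescence / fluor_per_um_amc / minutes


def fold_activation(rates_with_peptide: Sequence[float],
                    basal_rates: Sequence[float]) -> float:
    """Mean rate with peptide divided by mean basal rate."""
    return mean(rates_with_peptide) / mean(basal_rates)


if __name__ == "__main__":
    # Hypothetical AMC standard curve (uM vs. fluorescence units).
    k = fit_slope_through_origin([0, 1, 2, 4], [0, 100, 200, 400])

    # Hypothetical triplicates: fluorescence increase over 5 minutes.
    basal = [hydrolysis_rate(d, 5.0, k) for d in (500, 480, 520)]
    activated = [hydrolysis_rate(d, 5.0, k) for d in (1000, 1100, 900)]

    print("basal rate: %.2f +/- %.2f uM/min" % (mean(basal), stdev(basal)))
    print("fold activation: %.2f" % fold_activation(activated, basal))
```

Reporting the mean and standard deviation of three separate experiments mirrors the error bars described for Figure 2; the numbers above are placeholders only.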

Crystallization, data collection and structure determination. Initial crystallization
screening at room temperature used the hanging-drop vapor diffusion technique.
Promising conditions were further refined at 4°C. To improve the crystal quality,
a micro-seeding method was used for the final crystallization. The best-looking crystal
formed over a well solution containing 100 mM Tris, pH 8.5 and 8% PEG8000 at
4°C in 4 days.

For data collection, the crystal was cryoprotected in a solution containing 100 mM
Tris-HCl buffer (pH 8.0) and 30% glycerol. X-ray data were collected at
beamline X29 of the National Synchrotron Light Source (NSLS). The data were processed
with Mosflm 29 and the CCP4 suite 30.

The structure of the ERAP1 C-terminal domain was determined by the molecular
replacement method using the ERAP1 intact protein structure (PDB code 2XDT) 20
as the starting model, with residues 530-610 deleted and all remaining residues
changed to alanines to reduce model bias. Molecular replacement was performed
with Molrep and refinements were performed with the Refmac program. 5% of
the total reflection data was excluded from the refinement cycles and used to
calculate the free R factor (Rfree) for monitoring refinement progress. Rigid body and
subsequent restrained refinements and model building with COOT 31 led to the final
crystallographic Rwork/Rfree of 19.1%/26.1% at 2.3 Å resolution. The X-ray data and
structure refinement statistics are shown in Table 1. A structural alignment of
the ERAP1 C-terminal domain with the corresponding domain of Tricorn Interacting
Factor F3 (TIFF3) 32 is shown in Supplementary Figure 1. All the figures were
drawn using PyMOL (DeLano Scientific) and labels were added using Adobe®
Photoshop.
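
The role of the 5% free set can be illustrated with a short sketch: a fixed random subset of reflections is set aside before refinement, and the same R-factor formula is evaluated separately on the working and free sets. This is a simplified illustration under stated assumptions (structure-factor amplitudes already on a common scale, a hypothetical random seed and helper names); it is not the procedure implemented in Refmac.

```python
import random
from typing import List, Sequence, Tuple


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """R = sum | |Fobs| - |Fcalc| | / sum |Fobs|.

    ASSUMPTION: Fcalc has already been scaled to Fobs.
    """
    num = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    den = sum(abs(o) for o in f_obs)
    return num / den


def split_free_set(n_reflections: int, fraction: float = 0.05,
                   seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return (work_indices, free_indices).

    The free set is never used in refinement; it only monitors progress.
    """
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * fraction)
    free = sorted(indices[:n_free])
    work = sorted(indices[n_free:])
    return work, free


if __name__ == "__main__":
    work, free = split_free_set(1000)  # hypothetical reflection count
    print(len(work), len(free))
    # Toy amplitudes (hypothetical) to show the formula in use.
    print(r_factor([10.0, 20.0, 30.0], [9.0, 22.0, 30.0]))
```

Rwork is this R factor over the working reflections and Rfree is the same quantity over the held-out ones, so a gap between the two (here 19.1% versus 26.1%) indicates how well the model predicts data it was not refined against.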

Modeling of the tri-residue peptide Arg-Ala-Phe. The modeled tri-residue peptide
Arg-Ala-Phe is based on the last three positions (PC-2, PC-1, and PC) of the
experimentally determined His-tag peptide. These three histidine residues were first
mutated to Arg, Ala, and Phe, respectively. Side chains of the mutated tripeptide were
then adjusted to avoid steric clashes with the ERAP1 structure, using the torsion
option in Coot 31. Additional adjustments were made to better fit into the binding
groove and sub-pockets: a shift of ~0.9 Å towards Glu831, and a side-chain rotation of PC Phe
to allow an aromatic ring interaction between the PC Phe and Phe803 (Fig. 3a). The
modeled tripeptide in complex with the ERAP1 domain was further refined by energy
minimization (20 steps of the steepest descent algorithm in the GROMOS96
force field) using the SwissPdb Viewer program 33.

Accession code. Diffraction data and coordinates are deposited under accession 
code 3RJO in the Protein Data Bank. 